The detection of compact sources embedded in a background is a very common
problem in many fields of Astronomy. In these lecture notes we present a
review of different techniques developed for the detection and extraction of
compact sources, with a special focus on their application to the field of
the cosmic microwave background radiation. In particular, we will consider
the detection of extragalactic point sources and the thermal and kinematic
Sunyaev-Zeldovich effects from clusters of galaxies.


1 Introduction 

Observations of an astrophysical signal in the sky are usually corrupted by
some level of contamination (called noise or background), due to other
astrophysical emissions and/or to the detector itself. A common situation is
that the signals of interest are spatially well-localised, i.e. each of them
covers only a small fraction of the image, but we do not know a priori their
positions and/or amplitudes. Some examples are the detection of extragalactic
sources in cosmic microwave background (CMB) observations (see Fig. 1), the
identification of local features (emission or absorption lines) in noisy
one-dimensional spectra or the detection of objects in X-ray images. It is
clear that our ability to extract all the useful information from the image
will critically depend on our capacity to disentangle the signal(s) of
interest from the background.

Fig. 1. Simulation of the 44 GHz Planck frequency channel in a small patch of the
sky, containing CMB, Galactic foregrounds, extragalactic point sources and
instrumental noise. The point sources can be seen as localised objects embedded
in the background. The Planck Mission [?] is a satellite of the European Space
Agency to be launched in 2007 that will provide multifrequency observations of
the whole sky with unprecedented resolution and sensitivity.

The process to detect a localised signal in a given image usually involves
three different steps, which are not necessarily independent:

1.- Processing: some processing of the data (commonly linear filtering) is
usually performed in order to amplify the sought signal over the background.
This is an important step because in many cases the signals are relatively weak
with respect to the background and it becomes very difficult to detect them
in the original image. This is illustrated in Figure 2: the left panel shows a
simulation of white noise where a source with a Gaussian profile has been
added in the centre of the map; the right panel gives the same simulation
after filtering with the so-called matched filter. It becomes apparent that the
source was hidden in the original image whereas it has been enhanced in the
filtered image.

2.- Detection: we need a detection criterion, the detector, to decide if some
structure in the image is actually a real signal or if it is due to the background.
A very simple and widely used detector in Astronomy is thresholding: if the
intensity of the image is above a given value (e.g. $5\sigma$, where $\sigma$ is the
dispersion of the map), a detection of the signal is accepted; otherwise one
assumes that only background is present. In the example of Fig. 2 we see that
several peaks appear in the filtered image (right panel) but only one of them is
above the considered threshold $\nu\sigma$. Therefore, in this case, we would accept
only the highest peak as a true signal. Note that thresholding uses only the
intensity of the data to make the decision; however, other useful information
could also be included in order to improve the detector (e.g. curvature, size,
etc.).

3.- Estimation: a procedure must be established to estimate the parameters
(amplitude, size, position...) characterising the detected signal. For instance,
a simple possibility is to estimate the required parameters by fitting the signal
to its theoretical profile.

Fig. 2. This example illustrates the importance of filtering. A source with a
Gaussian profile has been placed in the centre of the image in a background of
white noise. The source cannot be distinguished in the original image (left
panel); however, after filtering (right panel), the source is enhanced over the
background fluctuations.
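The three steps (processing, detection, estimation) can be sketched in a few
lines of Python. This is only a minimal illustration under stated assumptions,
not the procedure used to produce the figures: the map size, source width,
amplitude and random seed are arbitrary choices, the function names are
hypothetical, and the data are filtered with the source profile itself, which is
proportional to the matched filter only when the background is white noise.

```python
import numpy as np

def gaussian_profile(n, sigma):
    """Circular Gaussian with unit peak, centred on pixel (n//2, n//2)."""
    y, x = np.indices((n, n)) - n // 2
    return np.exp(-(x**2 + y**2) / (2.0 * sigma**2))

def correlate_fft(data, kernel):
    """Correlate data with a centred kernel using FFTs (periodic boundaries)."""
    k_hat = np.fft.fft2(np.fft.ifftshift(kernel))
    return np.real(np.fft.ifft2(np.fft.fft2(data) * np.conj(k_hat)))

def detect_and_estimate(data, profile, nu_threshold=5.0):
    """Filter the map, threshold its highest peak and estimate the amplitude."""
    filtered = correlate_fft(data, profile)            # 1. processing
    nu = filtered / filtered.std()                     # map in units of its dispersion
    peak = np.unravel_index(np.argmax(nu), nu.shape)
    detected = bool(nu[peak] > nu_threshold)           # 2. detection (thresholding)
    amplitude = filtered[peak] / np.sum(profile**2)    # 3. estimation (linear fit)
    return detected, peak, amplitude

# Assumed test case: a source with a peak of 2 sigma hidden in unit white noise.
rng = np.random.default_rng(0)
n, sigma_b, true_amp = 128, 3.0, 2.0
profile = gaussian_profile(n, sigma_b)
data = true_amp * profile + rng.normal(0.0, 1.0, (n, n))

detected, peak, amplitude = detect_and_estimate(data, profile)
print(detected, peak, round(float(amplitude), 2))
```

In the unfiltered map the source peak is comparable to ordinary noise
fluctuations, whereas after filtering it stands well above the threshold. The
amplitude estimate divides by the sum of the squared profile, which makes it
unbiased when the peak coincides with the true source position.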


The aim of these lecture notes is to present the problem of the extraction
of localised signals (compact sources) in the context of CMB Astronomy
and to review some of the methods developed to deal with it. In section 2
we outline the problem of component separation in CMB observations.
Section 3 reviews some of the techniques developed for extraction of point
sources, including, among others, the matched filter and the Mexican Hat
Wavelet. Sections 4 and 5 deal with the extraction of the thermal and
kinematic Sunyaev-Zeldovich effects in multifrequency microwave observations,
respectively. Section 6 briefly discusses some techniques for the extraction
of statistical information from undetected sources. Finally, in section 7 we
present our conclusions.
2 The microwave sky and the problem of component separation


Microwave observations contain not only the cosmological signal but also
Galactic foregrounds, thermal and kinetic Sunyaev-Zeldovich (SZ) effects from
clusters of galaxies and emission from extragalactic point sources [?].
In addition, they also contain some level of noise coming from the detector
itself. In order to recover all the wealth of information encoded in the CMB
anisotropies, it is crucial to separate the cosmological signal from the rest of
the components of the sky. Moreover, the foregrounds themselves contain very
valuable information about astrophysical phenomena [28]. Therefore, the
development of tools to reconstruct the different components of the microwave
sky is of great interest, not only to clean the CMB signal but also to recover
all the useful information present in the foregrounds.

The main Galactic foregrounds are the synchrotron, free-free and thermal
dust emissions. The synchrotron emission is due to relativistic electrons
accelerated in the Galactic magnetic field. The free-free emission is the thermal
bremsstrahlung from hot electrons when accelerated by ions in the interstellar
gas. The observed dust emission is the sum over the emission from each
dust grain along the line of sight (dust grains in our galaxy are heated by
the interstellar radiation field, absorbing UV and optical photons and
re-emitting the energy in the far infrared). In addition, there is some
controversy about the presence of an anomalous foreground at microwave
frequencies [?] that could be due to the emission of spinning dust grains [?].

The thermal SZ effect [?] is a spectral distortion of the blackbody spectrum
of the CMB produced by inverse Compton scattering of microwave photons
by hot electrons in the intracluster gas of a cluster of galaxies. In addition,
the radial peculiar velocities of clusters also produce secondary anisotropies
in the CMB via the Doppler effect, known as the kinetic SZ effect [?]. For a
review on the SZ effect, see [?].

The thermal SZ effect has a distinct spectral signature. It produces a
temperature decrement below 217 GHz and an increment above that frequency.
The change in intensity (see Fig. 3) is given by


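$$ \Delta I = \frac{2 (k T_0)^3}{(hc)^2}\, \frac{x^4 e^x}{(e^x - 1)^2}\, y_c \left[ x \coth\frac{x}{2} - 4 \right], \qquad x = \frac{h\nu}{k T_0}, \tag{1} $$

where $T_0$ is the CMB temperature and $y_c = \frac{k \sigma_T}{m_e c^2} \int dl\, T_e n_e$ is the Compton
parameter, which is a function of the electron density $n_e$ and temperature
$T_e$. This distinct frequency dependence can be used in multifrequency
observations to separate the thermal SZ effect from the rest of the components
of the microwave sky and, in particular, from the CMB.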
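As a numerical check of equation (1), the spectral shape can be evaluated
directly. The sketch below is only illustrative: it assumes a CMB temperature
$T_0 = 2.725$ K (a value not quoted in the text), the function names are
hypothetical, and the frequency at which the effect changes sign is located by
simple bisection.

```python
import numpy as np

H = 6.62607015e-34   # Planck constant [J s]
K_B = 1.380649e-23   # Boltzmann constant [J/K]
C = 2.99792458e8     # speed of light [m/s]
T0 = 2.725           # assumed CMB temperature [K]

def thermal_sz_delta_i(nu_ghz, y_c=1.0):
    """Intensity change of equation (1), in W m^-2 Hz^-1 sr^-1, for Compton parameter y_c."""
    x = H * np.asarray(nu_ghz, dtype=float) * 1e9 / (K_B * T0)
    prefactor = 2.0 * (K_B * T0) ** 3 / (H * C) ** 2
    spectral = x**4 * np.exp(x) / np.expm1(x) ** 2
    return prefactor * spectral * y_c * (x / np.tanh(0.5 * x) - 4.0)

def sz_null_frequency(lo=100.0, hi=400.0, tol=1e-6):
    """Bisection for the frequency (in GHz) where the thermal SZ effect changes sign."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if thermal_sz_delta_i(mid) < 0.0:
            lo = mid      # still in the decrement regime
        else:
            hi = mid      # already in the increment regime
    return 0.5 * (lo + hi)

nu0 = sz_null_frequency()
print(f"thermal SZ null at {nu0:.1f} GHz")
```

The sign of the returned intensity change is negative below the null and
positive above it, which is the property exploited by multifrequency
separation methods.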

The Doppler shift induced by the kinetic SZ effect in the CMB temperature
fluctuations is given by:

$$ \frac{\Delta T}{T} = -\frac{v_r}{c}\, \tau, \tag{2} $$

where $v_r$ is the radial velocity of the cluster and $\tau = \sigma_T \int n_e\, dl$ is the optical
depth.

Fig. 3. Frequency dependence of the thermal SZ effect (in arbitrary units).

The thermal and kinetic SZ effects imprint anisotropies in the CMB at
scales below a few arcminutes. Therefore, they are compact sources whose
shape is given by the convolution of the beam of the experiment with the
cluster profile. It is anticipated that it will be very difficult to detect the
kinetic SZ effect and to separate it from the cosmological signal. This is due
to the fact that it has the same frequency dependence as the CMB (since it is
just a Doppler shift). Moreover, it is a very weak effect, around one order of
magnitude lower than the thermal effect. The SZ effect is a very useful
cosmological probe. Future SZ surveys will allow one to obtain very valuable
information about some of the cosmological parameters, such as $H_0$,
$\Omega_m$ and $\sigma_8$ (for a review see [?]).

Emission from extragalactic point sources is an important contaminant for
high-resolution CMB experiments. By point source we mean that the typical
angular size of these objects is much smaller than the resolution of the
experiment (which is usually the case in CMB observations) and therefore they
appear in the data as point-like objects convolved with the beam of the
instrument. There are two main source populations: radio sources, which dominate
at lower frequencies (< 300 GHz), and far-IR sources, which give the main
contribution at higher frequencies (> 300 GHz). These populations consist mainly
of compact AGN, blazars and radio loud QSOs in the radio and of inactive
spiral galaxies in the far-IR. Different models for the radio [?] and
infrared [?] point source populations have been proposed. However, there are
still many uncertainties with regard to the number counts and the spectral
behaviour of these objects, due to a lack of data in the frequency range
explored by CMB experiments. Therefore experiments such as Planck will provide
unique information to understand the astrophysical processes taking place in
these populations of sources. An additional problem is the heterogeneous nature
of extragalactic point sources, since, among other complications, each source
has its own frequency dependence. Therefore they cannot be treated as a single
foreground to be separated from the other components by means of multifrequency
observations.

There are basically two different approaches to perform component separation.
The first one tries to reconstruct simultaneously all the components of the
microwave sky whereas the second one focuses on just one single component.
The first type of methods includes the Wiener filter [?], the maximum-entropy
method [?] and blind source separation [?].
These methods usually assume that the components to be reconstructed can
be factorised in a spatial template times a frequency dependence (but see [?]
for a recent work where this assumption is not necessary). This assumption
is correct for the CMB and the SZ effects but it is only an approximation for
the Galactic foregrounds. In addition, point sources cannot be factorised in
this way and therefore these techniques are not well-suited for extracting this
contaminant. The second approach consists of methods designed
to extract a particular component of the sky. For instance, the blind EM
algorithm of [?] or the internal linear combination of [?] try to recover
only the CMB component. Moreover, this type of approach is especially useful
for the detection of localised objects such as extragalactic point sources or the
SZ effects. In these lectures we will describe some of these methods that have
been developed with the aim of extracting compact sources from microwave
observations.

3 Techniques for extraction of point sources 

The most common approach to detect point sources embedded in a background
is probably linear filtering. A linearly filtered image $w(\vec{x})$ is obtained
as the convolution^1 of the data $y(\vec{x})$ with the filter $\psi(\vec{x})$:

$$ w(\vec{x}) = \int y(\vec{u})\, \psi(\vec{x} - \vec{u})\, d\vec{u}. \tag{3} $$

Note that those parts of the data that resemble the shape of the filter will be
enhanced in the filtered map. Therefore, the filter should have a similar profile
to that of the sought signal. Equivalently, we can work in Fourier space:

$$ w(\vec{x}) = \int y(\vec{q})\, \psi(\vec{q})\, e^{-i \vec{q} \cdot \vec{x}}\, d\vec{q}, \tag{4} $$

where $f(\vec{q})$ denotes the Fourier transform of $f$. From the previous equation,
we see that the filter favours certain Fourier modes of the data.

^1 Strictly speaking, the filtered image can be written as a convolution provided
the filter is linear and spatially homogeneous (see e.g. [?]).

In principle, it is equivalent to perform the filtering in real or Fourier 
space. However, from the practical point of view, direct convolution is a very 
CPU-time consuming operation. Therefore, working in Fourier space, where 
a simple product is performed, is preferred. 
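The equivalence between the two routes, and the difference in cost, can be
illustrated with a short one-dimensional sketch. It assumes periodic boundary
conditions (so that the convolution is circular) and uses hypothetical function
names; the direct sum needs of order N^2 operations whereas the FFT route needs
of order N log N.

```python
import numpy as np

def circular_convolve_direct(y, psi):
    """Direct circular convolution: w[i] = sum_j y[j] * psi[(i - j) mod n]."""
    n = len(y)
    out = np.zeros(n)
    for i in range(n):
        for j in range(n):
            out[i] += y[j] * psi[(i - j) % n]
    return out

def circular_convolve_fft(y, psi):
    """Same operation as a simple product of the two transforms in Fourier space."""
    return np.real(np.fft.ifft(np.fft.fft(y) * np.fft.fft(psi)))

rng = np.random.default_rng(1)
y = rng.normal(size=64)                        # toy data vector
psi = np.exp(-0.5 * (np.arange(64) - 32) ** 2 / 4.0)   # toy filter

w_direct = circular_convolve_direct(y, psi)
w_fft = circular_convolve_fft(y, psi)
print(np.max(np.abs(w_direct - w_fft)))
```

Both functions return the same filtered vector up to rounding errors, so the
choice between them is purely one of computing time.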

Different linear filters have been proposed in the literature to detect point
sources in CMB maps, including the matched filter [?], the Mexican hat
wavelet [?], the scale adaptive filter [?], the biparametric scale
adaptive filter [?] or the adaptive top hat filter [?]. In addition, non-linear
techniques have also been proposed, such as the Bayesian method of [?] or
the non-linear fusion of [?]. In the next subsections we give an overview
of some of these techniques, including applications to CMB simulated data.

3.1 The matched filter 

Let us assume that we have a signal of amplitude $A$ at position $\vec{x}_0$ embedded
in a background of dispersion $\sigma$. The amplification $\mathcal{A}$ of the signal
obtained with a filter is given by:

$$ \mathcal{A} = \frac{w(\vec{x}_0)/\sigma_w}{A/\sigma}, \tag{5} $$

where $w(\vec{x}_0)$ is the value of the filtered map at the position of the source
and $\sigma_w$ is the dispersion of the filtered map. Therefore, if the amplification is
greater than one, the contrast between the signal and the background has been
increased in the filtered map, improving the chances of detecting the source
with respect to the original data. This is the main idea behind filtering: it
puts you in a better position to detect the sources.

The matched filter (MF) is defined as the linear filter that gives maximum
amplification of the signal. As an example, we will outline how to construct
the MF for a source $s(x)$ with spherical symmetry (a more detailed derivation
can be found e.g. in [?]). Let us consider a set of 2-dimensional data $y(\vec{x})$:

$$ y(\vec{x}) = s(x) + n(\vec{x}), \qquad s(x) = A\, \tau(x), \tag{6} $$

where $\vec{x}$ is a 2-dimensional vector of position and $x = |\vec{x}|$. The source is
characterised by a (spherically symmetric) profile $\tau(x)$ and an amplitude
$A = s(0)$. $n(\vec{x})$ is the noise (or background) contribution which, for
simplicity, is assumed to be a homogeneous and isotropic random field with zero
mean and characterised by a power spectrum $P(q)$ ($q = |\vec{q}|$), i.e.,

$$ \langle n(\vec{q})\, n^*(\vec{q}\,') \rangle = P(q)\, \delta^2(\vec{q} - \vec{q}\,'), \tag{7} $$

where $n(\vec{q})$ is the 2-dimensional Fourier transform of $n(\vec{x})$.

Let us introduce a filter $\psi$ with spherical symmetry. The filtered field $w$ is
given by:

$$ w(\vec{x}) = \int y(\vec{q})\, \psi(q)\, e^{-i \vec{q} \cdot \vec{x}}\, d\vec{q}. \tag{8} $$

It can be shown that the filtered field at the position of the source (for
simplicity we will assume that the source is at the origin) is given by:

$$ w(0) = 2\pi \int q\, s(q)\, \psi(q)\, dq, \tag{9} $$

whereas the variance of the filtered field is obtained as:

$$ \sigma_w^2 = 2\pi \int q\, P(q)\, \psi^2(q)\, dq. \tag{10} $$

We want to find the filter that satisfies the following two conditions:

1. $\langle w(0) \rangle = A$

2. $\sigma_w^2$ is a minimum with respect to the filter $\psi$

The first condition means that the filter is an unbiased estimator of the
amplitude of the source and gives straightforwardly the constraint:

$$ \int q\, \tau(q)\, \psi(q)\, dq = \frac{1}{2\pi}. \tag{11} $$


In order to minimise the variance of the filtered map (condition 2) including
the previous constraint, we introduce a Lagrange multiplier $\lambda$:

$$ L(\psi) = \sigma_w^2(\psi) + \lambda \left[ \int q\, \tau(q)\, \psi(q)\, dq - \frac{1}{2\pi} \right]. \tag{12} $$

Taking variations with respect to $\psi$ and setting the result equal to zero, we
find the matched filter:

$$ \psi(q) = k\, \frac{\tau(q)}{P(q)}, \tag{13} $$

$$ k = \left[ 2\pi \int q\, \frac{\tau^2(q)}{P(q)}\, dq \right]^{-1}. \tag{14} $$

Note that the matched filter favours those modes where the contribution
of the signal ($\tau$) is large and that of the noise ($P$) is small.
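A discrete version of this construction can be written compactly with FFTs.
The following is a sketch under stated assumptions rather than a reference
implementation: the grid size, the Gaussian profile of width 3 pixels and the
background spectrum P(q) ~ q^-2 are arbitrary choices, the function names are
hypothetical, and the normalisation constant k is obtained from a sum over the
FFT modes instead of the radial integral of equation (14).

```python
import numpy as np

def wavenumber_grid(n):
    """Modulus q of the discrete wavevector on an n x n periodic grid."""
    f = 2.0 * np.pi * np.fft.fftfreq(n)
    qx, qy = np.meshgrid(f, f, indexing="ij")
    return np.hypot(qx, qy)

def matched_filter(tau_hat, power):
    """Discrete analogue of psi = k tau / P, with k fixed by <w(x0)> = A."""
    k = tau_hat.size / np.sum(np.abs(tau_hat) ** 2 / power)
    return k * np.conj(tau_hat) / power

n, sigma_b, amp = 128, 3.0, 1.0
y, x = np.indices((n, n)) - n // 2
tau = np.exp(-(x**2 + y**2) / (2.0 * sigma_b**2))    # unit-peak Gaussian profile
tau_hat = np.fft.fft2(np.fft.ifftshift(tau))         # profile centred on pixel (0, 0)

q = wavenumber_grid(n)
q[0, 0] = q[0, 1]                                    # avoid dividing by zero at q = 0
power = q ** -2.0                                    # assumed background spectrum

psi_hat = matched_filter(tau_hat, power)

# Condition 1: a noiseless source of amplitude amp is recovered without bias.
source = amp * tau                                   # source at pixel (n//2, n//2)
w_clean = np.real(np.fft.ifft2(np.fft.fft2(source) * psi_hat))
recovered = w_clean[n // 2, n // 2]

# Amplification of equation (5) from the expected dispersions before and after filtering.
sigma = np.sqrt(np.mean(power))
sigma_w = np.sqrt(np.mean(power * np.abs(psi_hat) ** 2))
amplification = (recovered / sigma_w) / (amp / sigma)
print(recovered, amplification)
```

Since the identity filter also satisfies the unbiasedness condition, the
minimum-variance solution can never have a larger dispersion than the
unfiltered map, so the amplification obtained in this way is at least one.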

Assuming simple models for the Galactic foregrounds, Ref. [?] has estimated
the catalogue of point sources that Planck will produce. According to this
work, the number of sources detected by Planck above a $5\sigma$ level
in a sky area of 8 sr will range from around 650 for the 70 GHz channel to
around 38000 at 857 GHz.

The matched filter has also been applied for the detection of point sources
in the 1st-year WMAP data [?]. Using this technique, a catalogue of 208
extragalactic point sources has been provided.
3.2 The Mexican Hat Wavelet

Wavelet techniques are very versatile tools that only recently have been
applied to the analysis of CMB maps. The main property that makes wavelet
transforms so useful is that they retain simultaneously information about the
scale and position of the image. This means that we can study the structure
of an image at different scales without losing all the spatial information (as
occurs in the case of the Fourier transform). There is not a unique way to
construct a wavelet transform (see e.g. [?]). In this section we will focus
on one particular wavelet, the Mexican Hat Wavelet (MHW), that has been
successfully implemented for the detection of point sources with a Gaussian
profile in CMB simulated observations [?]. The MHW is the second
derivative of the Gaussian function (see Fig. 4):

$$ \psi(x) \propto \left[ 2 - \left( \frac{x}{R} \right)^2 \right] e^{-x^2 / 2R^2}, \tag{15} $$

and in Fourier space is given by:

$$ \hat{\psi}(q) \propto (qR)^2 \exp\left( -\frac{(qR)^2}{2} \right), \tag{16} $$

where $R$ is the scale of the MHW, a parameter that determines the width of the
wavelet. Note that the MHW - and wavelet functions in general - are compensated,
i.e., the integral below the curve is zero. When filtering the data with the
MHW, this property helps to remove contributions of the background with a
scale of variation larger than the one of the wavelet.

Fig. 4. The Mexican Hat Wavelet in 2 dimensions.







R. B. Barreiro 


The method to detect point sources is based on the study of the wavelet
coefficients map (i.e. the image convolved with the MHW) at a given scale.
Those wavelet coefficients above a fixed threshold are identified as point source
candidates. This works well because point sources are amplified in wavelet
space. This can be easily seen in Fig. 5, which shows a
graphical example of the performance of the MHW for a simulation of the
Planck 857 GHz channel. Three of the panels correspond to the dust emission
(top-left), which is the dominant contaminant at this frequency, the emission
of the extragalactic point sources (top-right) and the total emission at this
frequency (bottom-left) including the Galactic foregrounds, CMB, SZ effect,
extragalactic point sources and instrumental noise. The last panel
(bottom-right) shows the total emission map after convolution with a MHW at a
certain scale $R_0$. It is clear that in the wavelet coefficients map a large
fraction of the background has been removed and that the signal of the point
sources has been enhanced. Therefore, the detection level in wavelet space is
greater than the detection level in real space:

$$ \frac{w(R)}{\sigma_w(R)} > \frac{A}{\sigma}, \tag{17} $$

where $w(R)$ is the wavelet coefficient at scale $R$ at the position of the source,
$\sigma_w(R)$ is the dispersion of the wavelet coefficients map, $A$ is the amplitude
of the source and $\sigma$ is the dispersion of the real map.

Fig. 5. An example of the performance of the MHW for a simulation of the Planck
857 GHz channel (see text for details). Panels: Dust (top-left), Point Sources
(top-right), Total (bottom-left), Wavelet Coefficients (bottom-right).

Note that the amplification (defined as the ratio between the detection
level in wavelet space and the detection level in real space) depends on the
wavelet scale $R$. In fact, for a given image, there exists an optimal scale $R_0$
that gives maximum amplification for the point sources and that can be
determined from the data. For a point source convolved with a Gaussian beam
of dispersion $\sigma_b$, the value of the wavelet coefficient $w(R)$ at the position of
the source is given by

$$ w(R) = 2R\sqrt{2\pi}\, A\, \frac{(R/\sigma_b)^2}{\left[ 1 + (R/\sigma_b)^2 \right]^2}, \tag{18} $$

whereas the dispersion of the wavelet coefficients map at scale $R$ is

$$ \sigma_w^2(R) = 2\pi R^2 \int q\, P(q)\, |\hat{\psi}(qR)|^2\, dq, \tag{19} $$

where $P(q)$ is the power spectrum of the background. Taking the previous
expressions into account, one can obtain the optimal scale $R_0$ by maximising
the amplification $\mathcal{A}$ versus $R$. Fig. 6 shows the amplification of the signal
versus the scale for simulated observations of the Planck Low and High Frequency
Instruments (LFI and HFI). Note that the optimal scale is close to $\sigma_b$. This is
expected, since this is the scale that characterises the source, but the value of
$R_0$ will also depend on the background. For instance, if the noise contribution
is more important at scales smaller than the one of the source, this will tend
to move the optimal scale to values greater than $\sigma_b$ and vice versa. Although
the MHW will produce, in general, slightly less amplification than the MF,
it has the advantage of being an analytical function, which greatly simplifies
the use of this technique.
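The dependence of the optimal scale on the background can be checked
numerically from equations (18) and (19). The sketch below makes several
assumptions that are not in the text: two idealised power spectra (white noise
and P(q) ~ q^-2), scales measured in units of the beam dispersion, a Fourier
space wavelet taken from equation (16) up to a constant, and hypothetical
function names. The dispersion of the original map is left out because it does
not depend on R, so only the position of the maximum is meaningful. For these
two spectra the integrals can also be done by hand, giving $R_0 = \sqrt{3}\,\sigma_b$
for white noise and $R_0 = \sigma_b$ for P(q) ~ q^-2.

```python
import numpy as np

def trapezoid(f, x):
    """Trapezoidal rule for samples f on the grid x."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

def mhw_fourier(u):
    """Mother wavelet in Fourier space, up to a constant (equation 16)."""
    return u**2 * np.exp(-0.5 * u**2)

def amplification(R, sigma_b, power, q):
    """w(R) / sigma_w(R) from equations (18) and (19), for unit amplitude."""
    r = R / sigma_b
    w = 2.0 * R * np.sqrt(2.0 * np.pi) * r**2 / (1.0 + r**2) ** 2
    var = 2.0 * np.pi * R**2 * trapezoid(q * power(q) * mhw_fourier(q * R) ** 2, q)
    return w / np.sqrt(var)

def optimal_scale(sigma_b, power, scales, q):
    """Scale in `scales` that maximises the amplification."""
    amps = np.array([amplification(R, sigma_b, power, q) for R in scales])
    return scales[np.argmax(amps)]

sigma_b = 1.0
q = np.linspace(1e-4, 20.0, 20001)       # wavenumbers in units of 1 / sigma_b
scales = np.linspace(0.5, 3.0, 251)      # trial scales in units of sigma_b

r0_white = optimal_scale(sigma_b, lambda k: np.ones_like(k), scales, q)
r0_red = optimal_scale(sigma_b, lambda k: k ** -2.0, scales, q)
print(r0_white, r0_red)
```

The white noise case has more power at small scales than the q^-2 case, and its
optimal scale is correspondingly larger, in line with the qualitative argument
given above.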

The procedure to detect point sources is as follows. First, the optimal scale
$R_0$ is obtained and the data are filtered with a MHW of scale $R_0$. Those pixels
$k$ above a given threshold are identified as point sources. The amplitude of
the detected sources is then estimated using a multiscale $\chi^2$ fit for each
pixel $k$:

$$ \chi^2_k = \sum_{i,j} \left( w^{t}_{i,k} - w^{o}_{i,k} \right) V^{-1}_{ij} \left( w^{t}_{j,k} - w^{o}_{j,k} \right), \tag{20} $$

where $V$ is the covariance matrix between the different scales $i, j$ and
$w^{t}_{i,k}$, $w^{o}_{i,k}$ correspond, respectively, to the theoretical (given by
equation (18)) and observed wavelet coefficient at scale $i$ and position $k$.
Four scales are used for the $\chi^2$. Note that this fit can also be used to
discard point source candidates: if a source candidate at pixel $k$ does not
have an acceptably low value of $\chi^2$, it could be rejected as a point source.
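Because the theoretical coefficients are linear in the amplitude, the minimum
of the multiscale fit has a closed form. The following sketch illustrates this
with made-up inputs: the four scales and the covariance matrix are
hypothetical (in practice the covariance would be estimated from the wavelet
coefficient maps), and the function names are not taken from any existing code.

```python
import numpy as np

def theoretical_coefficients(scales, sigma_b):
    """Equation (18) for unit amplitude at each wavelet scale."""
    r = scales / sigma_b
    return 2.0 * scales * np.sqrt(2.0 * np.pi) * r**2 / (1.0 + r**2) ** 2

def multiscale_fit(w_obs, scales, sigma_b, cov):
    """Amplitude minimising the chi^2 of equation (20), and the chi^2 at the minimum."""
    t = theoretical_coefficients(scales, sigma_b)
    vinv_t = np.linalg.solve(cov, t)                 # V^-1 t without forming the inverse
    amp = (vinv_t @ w_obs) / (vinv_t @ t)            # generalised least-squares estimate
    resid = amp * t - w_obs
    chi2 = float(resid @ np.linalg.solve(cov, resid))
    return float(amp), chi2

scales = np.array([0.5, 1.0, 1.5, 2.0])              # four scales, in units of sigma_b
cov = 0.01 * (0.5 * np.eye(4) + 0.5 * np.ones((4, 4)))   # assumed covariance matrix
w_obs = 3.0 * theoretical_coefficients(scales, 1.0)       # noiseless source with A = 3

amp, chi2 = multiscale_fit(w_obs, scales, 1.0, cov)
print(amp, chi2)
```

A candidate whose coefficients do not follow the scale dependence of equation
(18) gives a large value of the returned chi^2 and can be discarded on that
basis.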

The MHW technique has been implemented to deal with simulated Planck
observations in flat patches of the sky [?] and also on the whole sphere [?].
Fig. 6. Amplification (in arbitrary units) versus wavelet scale R (in units of the
beam dispersion) for a typical region of the sky (the 100 GHz channel of the Low
Frequency Instrument has been withdrawn from the final Planck payload due to
a funding shortage).


The procedure to detect point sources on spherical data is very similar to the
one outlined before, but, in this case, the Spherical Mexican Hat Wavelet
(SMHW) is used to convolve the data, which is given by [?]

$$ \Psi_S(y, R) = \frac{1}{\sqrt{2\pi}\, N(R)} \left[ 1 + \left( \frac{y}{2} \right)^2 \right]^2 \left[ 2 - \left( \frac{y}{R} \right)^2 \right] e^{-y^2 / 2R^2}, \qquad N(R) = R \left( 1 + \frac{R^2}{2} + \frac{R^4}{4} \right)^{1/2}, \tag{21} $$

where $y$ is the distance to the tangent plane, which is given by $y = 2 \tan(\theta/2)$,
where $\theta$ is the latitude angle. Another important point when dealing with
spherical data is the estimation of the optimal scale. In CMB observations, the
contaminants can be very anisotropic in the sky. This means that the properties
of the background change significantly for different areas of the sky and
therefore the optimal scale for filtering with the SMHW has to be obtained
locally. This is simply done by projecting small patches of the sky on a plane
and obtaining the optimal scale for each of them. The maps are then convolved
with SMHWs of different scales and the detection is performed on the sphere
but with the corresponding $R_0$ for each patch.
Table 1 shows the catalogue predicted by [?] using simulated Planck
observations of the whole sky. In addition to CMB and point sources, Galactic
foregrounds, thermal SZ and instrumental noise were included in the
simulations. Using the recovered catalogue, mean spectral indices can also be
estimated with good accuracy. The SMHW has also been adapted to deal
with realistic asymmetric beams such as those expected for the Planck Mission.

Table 1. Predicted point source catalogue using the SMHW from simulated Planck
observations. The point sources have been detected outside a Galactic cut that varies
from channel to channel (from no cut at the lowest frequency channels up to a
Galactic cut with b = 25 degrees for 857 GHz). The different columns correspond
to: Planck frequency channel, number of detections (above the minimum flux),
minimum flux, mean error, mean bias, number of optimal scales needed for the
algorithm and completeness of the catalogue above the minimum flux (see [?] for
details).

Frequency   Number   Min. Flux   Mean error   Mean bias   Number of   Completeness
  (GHz)                 (Jy)        (%)          (%)       scales         (%)
   857       27257      0.48        17.7         -4.4        17            70
   545        5201      0.49        18.7          4.0        15            75
   353        4195      0.18        17.7          1.4        10            70
   217        2935      0.12        17.0         -2.5         4            80
   143        3444      0.13        17.5         -4.3         2            90
   100        3342      0.16        16.3         -7.0         4            85
    70        2172      0.24        17.1         -6.7         6            80
    44        1987      0.25        16.4         -6.4         9            85
    30        2907      0.21        18.7          1.2         7            85


It is also interesting to note that the MHW technique has been combined
with the maximum-entropy method [88]. Using Planck simulated data it was
shown that the joint method improved the quality of both the reconstruction
of the diffuse components and the point source catalogue.

Finally, we would like to point out that the MHW has been used for the
detection of objects in X-ray images [?] and, more recently, in SCUBA [?]
and Boomerang data [?].

3.3 The Neyman-Pearson detector and the biparametric scale 
adaptive filter 

As mentioned in section 1, filtering helps the detection process because it
amplifies the sought signal over the background. However, whether we filter
or not, we still need a detection criterion - the detector - to decide if a given
signal belongs to the background or to a true source. In addition, the final
performance of the filter will clearly depend on the choice of the detector. A
criterion that has been extensively used in Astronomy is thresholding, i.e.,
those pixels of the data above a given value (e.g. $5\sigma$) are identified as the
signal. Thresholding has a number of advantages, including its simplicity and
the fact that it has a precise meaning in the case of Gaussian backgrounds
in the sense of controlling the probability of spurious detections. However, it
only uses a limited part of the information contained in the data, the intensity,
to perform decisions.
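The statement that thresholding controls the probability of spurious
detections for a Gaussian background can be made concrete with the one-sided
tail probability of the normal distribution. The sketch below uses hypothetical
function names and treats the number of independent samples in the map as an
assumed input, since neighbouring pixels of a filtered map are correlated.

```python
import math

def false_alarm_probability(nu):
    """Probability that a zero-mean Gaussian variable exceeds nu times its dispersion."""
    return 0.5 * math.erfc(nu / math.sqrt(2.0))

def expected_spurious(nu, n_independent):
    """Expected number of background samples above the nu-sigma threshold."""
    return n_independent * false_alarm_probability(nu)

for nu in (3.0, 4.0, 5.0):
    print(nu, false_alarm_probability(nu), expected_spurious(nu, 512 * 512))
```

Raising the threshold from 3 to 5 dispersions lowers the false alarm
probability by several orders of magnitude, at the price of missing the
fainter sources.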

An example of a detector - based on the Neyman-Pearson rule - that takes
into account additional information has been recently proposed in [?]
for 1-dimensional signals and [?] for the 2-dimensional case. The first step
of the procedure is to identify maxima as point source candidates. To decide
then whether the maxima are due to the presence of background on its own
or to a combination of background plus source, a Neyman-Pearson detector
is applied, which is given by (for 2-dimensional signals):

$$ L(\nu, \kappa, \epsilon) = \frac{n(\nu, \kappa, \epsilon)}{n_b(\nu, \kappa, \epsilon)} > L_*, \tag{22} $$

where $n_b(\nu, \kappa, \epsilon)\, dx\, d\nu\, d\kappa\, d\epsilon$ is the expected number of maxima of the
background in the intervals $(x, x+dx)$, $(\nu, \nu+d\nu)$, $(\kappa, \kappa+d\kappa)$ and
$(\epsilon, \epsilon+d\epsilon)$, whereas $n(\nu, \kappa, \epsilon)\, dx\, d\nu\, d\kappa\, d\epsilon$ corresponds to the same
number in the presence of background plus source. $\nu$, $\kappa$ and $\epsilon$ are the
normalised intensity, normalised curvature and normalised shear of the field
respectively, and $x$ is the spatial variable. For a homogeneous and isotropic
Gaussian background and point sources with spherical symmetry, it can be shown
that the previous detector is equivalent to [?]

$$ \varphi(\nu, \kappa) = a\nu + b\kappa > \varphi_*, \tag{23} $$

where $a$ and $b$ are constants that depend on the properties of the background
and the profile of the source, and $\varphi_*$ is a constant that needs to be fixed.
Therefore, if the considered maximum satisfies $\varphi > \varphi_*$, we decide that the
signal is present; otherwise we consider that the maximum is due to the
presence of only background.

Using this detector, Ref. [?] compares the performance of different filters. In
order to do this, $\varphi_*$ is fixed to produce the same number of spurious sources
for all the filters and then the numbers of true detections are compared. In their
study they consider the matched filter, the Mexican hat wavelet, the scale
adaptive filter and the biparametric scale adaptive filter (BSAF). In addition,
the scale of the filter is allowed to vary (similarly to what is done in the
Mexican hat wavelet technique) through the introduction of a parameter $\alpha$.
For the case of a background with a scale-free power spectrum ($P(q) \propto q^{-\gamma}$)
and a source with Gaussian profile, the BSAF is given by:

$$ \psi_{BSAF}(q) = N(\alpha)\, z^{\gamma}\, e^{-z^2/2}\, (1 + c z^2), \qquad z = q \alpha R, \tag{24} $$

where $c$ is a free parameter. $\alpha$ and $c$ are optimised in order to produce the
maximum number of true detections given a fixed number of spurious
detections. For $c = 0$ and $\alpha = 1$ the MF is recovered. Using this approach the
performance of the considered filters has been studied for the case of a
Gaussian white noise background. The results predict that, in certain cases, the
BSAF can obtain up to around 40 per cent more detections than the other
filters. Although in CMB observations the background is not usually
dominated by white noise, this case can also be of interest to detect point sources
on CMB maps that have been previously processed using a component
separation technique such as the maximum-entropy method [88]. In this case,
the expected contribution of foregrounds and CMB is subtracted from each
of the frequency maps, leaving basically the emission of extragalactic point
sources and white noise (as well as some residuals). In this type of map, the
application of this technique could be useful. In any case, a test of the BSAF
on realistic CMB simulations would be necessary to establish how well this
approach would perform on real data.

3.4 Bayesian approach to discrete object detection 

Many of the standard techniques for the detection of point sources are based
on the design of linear filters. However, other methods - usually more
complicated - are also possible. For instance, Ref. [?] has recently proposed a
Bayesian approach for the detection of compact objects. The method is based on
the evaluation of the (unnormalised) posterior distribution $\overline{\Pr}(\theta|D)$ for
the parameters $\theta$ that characterise the unknown objects (such as position,
amplitude or size), given the observed data $D$. The unnormalised posterior
probability is given in terms of the likelihood $\Pr(D|\theta)$ and the prior
$\Pr(\theta)$ as:

$$ \overline{\Pr}(\theta|D) = \Pr(D|\theta)\, \Pr(\theta). \tag{25} $$

Two different strategies are proposed for the detection of compact sources:
an exact approach that tries to detect all the objects present in the data
simultaneously and an iterative - much faster - approach (called McClean
algorithm). In both cases an estimation of the parameters of the sources as
well as their errors is provided. For both methods, a Markov-Chain
Monte-Carlo technique is used to explore the parameter space characterising the
objects.

As an illustration of the performance of the method, [?] studies both 
algorithms on a simple example that contains 8 discrete objects with a 
Gaussian profile embedded in a Gaussian white noise background. The test 
image has 200x200 pixels and the signal-to-noise ratio of the objects ranges 
from 0.25 to 0.5. Using the exact method, where the number of objects is an 
additional parameter to be determined by the algorithm, all the objects are 
detected with no spurious detections. However, two of the objects (which 
overlapped in the noiseless data) are identified as a single detection. The 
results show that the parameters have been estimated with reasonably good 
accuracy. Unfortunately, although this method seems to perform very well, it 
is also very computationally demanding, which can make it unfeasible in many 
realistic applications. 

The iterative approach tries to detect the objects one by one. This 
significantly reduces the CPU time required by the algorithm while providing 
a convenient approximation to the exact method. For the simple example 
previously considered, the McClean algorithm provides results quite similar 
to those of the exact approach, with only one object fewer detected. 
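
The one-by-one strategy can be sketched as a loop that finds the best 
single-object fit, subtracts it from the data and repeats on the residual. 
The example below is not the McClean algorithm itself: the posterior maximum 
for one object is located with a brute-force grid search instead of a 
Markov-Chain Monte-Carlo exploration. The one-dimensional data, the known 
object width, the stopping threshold `snr_min` and all function names are 
assumptions made for illustration. 

```python
import math
import random


def profile(pos, n_pix, width):
    """Unit-amplitude Gaussian profile centred on pixel pos."""
    return [math.exp(-0.5 * ((i - pos) / width) ** 2) for i in range(n_pix)]


def best_single_object(data, width):
    """Grid search for the single-object fit that most reduces the residual."""
    best = (0.0, None, 0.0)  # (gain, position, amplitude)
    for pos in range(len(data)):
        t = profile(pos, len(data), width)
        norm = sum(ti * ti for ti in t)
        amp = sum(d * ti for d, ti in zip(data, t)) / norm  # least-squares amplitude
        gain = amp * amp * norm  # decrease of the sum of squared residuals
        if amp > 0.0 and gain > best[0]:
            best = (gain, pos, amp)
    return best


def iterative_detection(data, width, sigma, snr_min, max_objects=50):
    """Detect objects one at a time, subtracting each best fit from the data."""
    residual = list(data)
    found = []
    while len(found) < max_objects:
        gain, pos, amp = best_single_object(residual, width)
        if pos is None or math.sqrt(gain) / sigma < snr_min:
            break  # nothing significant is left in the residual
        t = profile(pos, len(residual), width)
        residual = [r - amp * ti for r, ti in zip(residual, t)]
        found.append((pos, amp))
    return found


# Toy data: two objects in white noise (illustrative numbers only).
rng = random.Random(3)
n_pix, width, sigma = 200, 3.0, 1.0
truth = [(40, 8.0), (120, 6.0)]
data = [rng.gauss(0.0, sigma) for _ in range(n_pix)]
for pos, amp in truth:
    data = [d + amp * t for d, t in zip(data, profile(pos, n_pix, width))]

found = iterative_detection(data, width, sigma, snr_min=5.0)
```

Because each object is fitted in isolation, two strongly overlapping objects 
can be absorbed into a single detection, which is the kind of approximation 
that an iterative scheme trades for speed. 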

This approach is certainly very promising and can be very useful for the 
detection of compact sources in future CMB data. However, it assumes 
knowledge of the functional form of the likelihood, the prior of the 
parameters and the profile of the objects, which will not be known in many 
realistic situations. In addition, the presence of anisotropic contaminants 
(such as instrumental noise or Galactic foregrounds) would introduce 
additional complexity that would make the algorithm more computationally 
demanding. [?] gives some hints on how to deal with some of these problems. 
In any case, it would be very useful to test the performance of the method 
under realistic conditions in order to establish the real potential of the 
technique. 


4 Techniques for extraction of the thermal SZ effect 

Another important application of the compact source extraction techniques 
is the detection of the thermal SZ effect due to galaxy clusters in microwave 
observations. The resolution of most CMB experiments (e.g. 5 arcminutes for 
the best Planck channels) is usually not enough to resolve the structure of 
the clusters of galaxies. Thus, the SZ emission appears in the CMB maps as 
compact sources whose shape is given by the convolution of the beam of the 
experiment with the profile of the cluster. This means that most of the 
techniques used for the detection of extragalactic point sources could be 
easily adapted to detect the SZ effect (in one single map), just by including 
the correct profile of the sought source in the algorithm. Studies of this 
type have been carried out, for instance, for the SAF [?], the MF and the 
Bayesian approach [?] discussed in the previous section. 

However, the thermal SZ effect has a very characteristic frequency signature 
that can be used to extract this emission, provided that multifrequency 
observations are available. One alternative to recover the SZ emission is to 
apply a component separation technique - such as the maximum-entropy method, 
Wiener filter or blind source separation - that tries to recover 
simultaneously all the different emissions of the microwave sky. The second 
possibility is to design specific methods to extract the SZ signal that make 
use of the multifrequency information [?]. Regarding this second strategy, 
we will describe two of the methods that have been devised for the extraction 
of the thermal SZ effect from multifrequency CMB observations. Both methods 
have been tested using Planck simulated data. 

4.1 Filtering techniques 

[?] presents different filtering techniques for the detection of SZ clusters 
in multifrequency maps. Two alternative strategies are proposed: a 
combination technique and the design of a multifrequency filter (or 
multifilter). In both cases, the spatial profile of the clusters is assumed 
to be known. In the combination method, the individual frequency maps are 
linearly combined in an optimal way. The weights of the linear combination 
can be determined from the data and they are optimal in the sense of giving 
the maximum amplification of objects that have the required spatial profile 
and the correct frequency dependence. This combined map is then filtered 
either with the MF or with the SAF, constructed taking into account the 
characteristics of this new map. In the second approach, each frequency map 
is filtered separately but the filters are constructed taking into account 
the cross-correlations between frequency channels as well as the spectral 
dependence of the SZ effect. Then, the filtered maps are added together. This 
second method can also be implemented for two different kinds of filters: the 
matched multifilter and the scale-adaptive multifilter. [?] performs a 
comparison of all these techniques, finding that the matched multifilter 
provides the best results. Another interesting point is that the combination 
method is appreciably faster than the multifilter technique, while it still 
detects a large fraction of the clusters found by the matched multifilter. 

Taking these results into account, we will discuss in more detail the 
matched multifilter (MMF) approach¹. Let us consider a set of N observed 
maps given by: 

$$ y_\nu(\mathbf{x}) = f_\nu A\,\tau_\nu(\mathbf{x}) + n_\nu(\mathbf{x}), \qquad \nu = 1, \ldots, N \qquad (26) $$

where, for illustration purposes, it is assumed that the SZ signal is due to 
the presence of a single cluster located at the origin of the image. The 
first term on the right-hand side describes the contribution of the sought 
signal (the thermal SZ effect in this case) whereas $n_\nu(\mathbf{x})$ is a 
generalised noise term that includes the sum of all the other components 
present in the map. $f_\nu$ is the frequency dependence of the thermal SZ 
effect (normalised to be 1 at a reference frequency), $\tau_\nu$ is the shape 
of the cluster at each frequency (i.e. the profile of the cluster convolved 
with the corresponding antenna beam) and $A$ is the amplitude of the SZ 
effect at the reference frequency. For simplicity the profile of the cluster 
is assumed to be spherically symmetric and is parameterised by a 
characteristic scale - the core radius $r_c$ - but a generalisation to more 
complex profiles can be easily done. The background is assumed to be a 
homogeneous and isotropic random field with zero mean value and cross-power 
spectrum $P_{\nu_1\nu_2}(q)$ defined as 

$$ \langle n_{\nu_1}(\mathbf{q})\, n^{*}_{\nu_2}(\mathbf{q}') \rangle = P_{\nu_1\nu_2}(q)\, \delta_D^2(\mathbf{q} - \mathbf{q}'), \qquad q = |\mathbf{q}| \qquad (27) $$


where $n_\nu(\mathbf{q})$ is the Fourier transform of $n_\nu(\mathbf{x})$ and 
$\delta_D^2(\mathbf{q})$ is the 2-dimensional Dirac distribution. 

¹ A similar technique has also been independently developed by [?], but in 
the context of the detection of extragalactic point sources in multifrequency 
observations. 

The MMF is given (in matrix notation) by 

$$ \Psi(q) = \alpha\, P^{-1} F, \qquad \alpha^{-1} = \int d\mathbf{q}\, F^{T} P^{-1} F \qquad (28) $$

where $F$ is the column vector $F = [f_\nu \tau_\nu]$ and $P^{-1}$ is the 
inverse of the cross-power spectrum matrix $P = [P_{\nu_1\nu_2}]$. 
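
To make the matrix expression of equation (28) concrete, the following 
sketch builds the filter mode by mode for a toy problem and checks two of its 
properties numerically: the filtered signal has unit response, and the 
variance of the filtered background equals the normalisation constant. 
Everything specific here is an assumption for illustration: the 
one-dimensional grid of modes standing in for the two-dimensional integral, 
the two channels, the Gaussian beams, the frequency dependence `f_nu`, the 
form of the background and the helper names. 

```python
import math


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - s) / a[r][r]
    return x


def matched_multifilter(F, P, dq):
    """Return (alpha, Psi) with Psi[k] = alpha * P[k]^-1 F[k] for each mode k."""
    pinv_f = [solve(P_k, F_k) for P_k, F_k in zip(P, F)]
    inv_alpha = dq * sum(
        f * g for F_k, pf_k in zip(F, pinv_f) for f, g in zip(F_k, pf_k)
    )
    alpha = 1.0 / inv_alpha
    psi = [[alpha * g for g in pf_k] for pf_k in pinv_f]
    return alpha, psi


# Toy set-up with two channels (all numbers are hypothetical).
dq = 0.05
modes = [k * dq for k in range(200)]
f_nu = [1.0, -0.5]   # frequency dependence of the sought signal
beam = [1.0, 2.0]    # beam widths, so tau_nu(q) is a Gaussian
F = [[f * math.exp(-0.5 * (q * s) ** 2) for f, s in zip(f_nu, beam)]
     for q in modes]
P = []
for q in modes:
    common = 10.0 / (1.0 + q * q)  # background correlated between channels
    P.append([[common + 1.0, common], [common, common + 2.0]])

alpha, psi = matched_multifilter(F, P, dq)

# Response of the filter to a unit-amplitude source.
unit_response = dq * sum(
    p * f for psi_k, F_k in zip(psi, F) for p, f in zip(psi_k, F_k)
)
# Variance of the filtered background.
variance = dq * sum(
    psi_k[i] * P_k[i][j] * psi_k[j]
    for psi_k, P_k in zip(psi, P) for i in range(2) for j in range(2)
)
```

In this toy case `unit_response` is 1 and `variance` equals `alpha` up to 
rounding, which follows directly from substituting the filter into the two 
sums. 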

The output map, where the detection of the SZ effect is finally performed, 
is obtained by filtering each frequency map with its corresponding filter and 
then adding together all the filtered maps. The detection is then performed 
by looking for regions (5 or more pixels) of connected pixels above a 3σ 
threshold. The maximum of the region determines the position of the cluster. 
In addition, the MMF is constructed so that the value of the output map at 
the position of the source is an unbiased estimator of the amplitude. 
Therefore, the estimated amplitude of the SZ effect is simply given by the 
value of the output field at the considered maximum. Another interesting 
point is that the scale of the clusters, $r_c$, will not be known a priori. 
To overcome this problem, the data are filtered with multifilters built for 
different values of $r_c$. When the scale of the cluster coincides with that 
of the filter, the amplification of the signal is maximum, which gives an 
estimate of the core radius $r_c$. 
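
The detection step described above (connected regions of at least 5 pixels 
above a 3σ threshold, with the maximum giving the position and the amplitude) 
can be sketched with a simple flood fill. The function name 
`detect_regions`, the use of 4-connected neighbours and the toy image are 
assumptions made for this example; the original work may define connectivity 
differently. 

```python
def detect_regions(image, threshold, min_pixels=5):
    """Return (peak_position, peak_value, n_pixels) for each connected region."""
    n_rows, n_cols = len(image), len(image[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    detections = []
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if seen[r0][c0] or image[r0][c0] <= threshold:
                continue
            # Flood fill over 4-connected neighbours above the threshold.
            stack, region = [(r0, c0)], []
            seen[r0][c0] = True
            while stack:
                r, c = stack.pop()
                region.append((r, c))
                for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and not seen[rr][cc] and image[rr][cc] > threshold):
                        seen[rr][cc] = True
                        stack.append((rr, cc))
            if len(region) >= min_pixels:
                peak = max(region, key=lambda p: image[p[0]][p[1]])
                detections.append((peak, image[peak[0]][peak[1]], len(region)))
    return detections


# Toy filtered map in units where sigma = 1: a 5-pixel blob and a lone spike.
image = [[0.0] * 12 for _ in range(12)]
image[5][6] = 10.0
for r, c in ((4, 6), (6, 6), (5, 5), (5, 7)):
    image[r][c] = 4.0
image[1][1] = 8.0  # above the threshold, but a single pixel only

sigma = 1.0
detections = detect_regions(image, threshold=3.0 * sigma)
# detections == [((5, 6), 10.0, 5)]
```

The isolated spike is rejected by the minimum-area requirement, while the 
blob is kept and its maximum provides both the position and the amplitude 
estimate. 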

The MMF has been tested on Planck simulated data of small patches 
(12.8° x 12.8°) of the sky containing CMB, thermal and kinetic SZ effects, 
Galactic foregrounds (synchrotron, free-free, thermal dust and spinning 
dust), extragalactic point sources and instrumental noise. The simulated 
Planck data are shown in Fig. 7. Note that the SZ emission of clusters is 
completely masked by the rest of the components present in the data. 

Fig. 8 shows the input thermal SZ emission included in the simulations and 
the reconstructed SZ map after filtering the data with a MMF of $r_c$ = 1 
pixel. Clusters with scales similar to the chosen one are clearly visible in 
the output map. [?] finds that the mean error in the determination of the 
position of the clusters is around 1 pixel whereas the core radii are 
determined with an error of 0.30 pixels. Regarding the determination of the 
cluster amplitudes, the mean error is around 30 per cent for the brightest 
clusters, whereas there is a bias in the estimation of the weakest clusters. 
This bias can be understood since most weak clusters will only reach the 
detection threshold if they are on top of a positive fluctuation of the 
background, which will lead to an overestimate of the amplitude. The bias 
could be reduced by improving the method of amplitude estimation (for 
instance by performing a fit to the profile of the cluster), which was simply 
given by the value of the maximum of the detection. Using this technique, 
Planck is expected to detect around 10000 clusters over 2/3 of the sky. 
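
The origin of the bias for weak clusters can be reproduced with a very simple 
Monte Carlo experiment. In the sketch below, which is not taken from the 
work discussed above, the estimated amplitude is modelled as the true 
amplitude plus Gaussian noise and a source counts as detected only if the 
estimate exceeds a 3σ threshold. The amplitudes, the number of trials and 
the function name are illustrative assumptions. 

```python
import random


def detected_amplitudes(true_amp, sigma, n_sigma, n_trials, rng):
    """Estimated amplitudes of the sources that pass the detection threshold."""
    estimates = (true_amp + rng.gauss(0.0, sigma) for _ in range(n_trials))
    return [est for est in estimates if est > n_sigma * sigma]


rng = random.Random(0)
sigma, n_trials = 1.0, 20000

# Weak source: true amplitude below the 3-sigma threshold.
weak = detected_amplitudes(2.0, sigma, 3.0, n_trials, rng)
weak_mean = sum(weak) / len(weak)
weak_fraction = len(weak) / n_trials

# Bright source: essentially always detected.
bright = detected_amplitudes(10.0, sigma, 3.0, n_trials, rng)
bright_mean = sum(bright) / len(bright)
```

Only the weak sources that sit on a positive noise fluctuation are detected, 
so `weak_mean` lies well above the true value of 2, whereas `bright_mean` 
stays close to 10. 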

An extension of the multifilter technique to spherical data has been carried 
out by [?]. They test the method on realistic simulations of Planck over the 
whole sky that, in addition to the main microwave components, also include 
the effect of non-uniform noise, sub-millimetric emission from celestial 
bodies of the Solar system and Galactic CO-line radiation. It is again found 
that the multifilter approach can significantly reduce the background, 
allowing the cluster signal to be detected. 

Fig. 7. Simulated Planck channels used to test the performance of the MMF. 

Fig. 8. The input (left panel) and reconstructed (right panel) SZ effect 
after filtering the data with a MMF of $r_c$ = 1 pixel. 

4.2 Bayesian non-parametric technique 

[20] proposes an alternative method to detect SZ clusters in Planck data. 
The method has also been tested on the simulated data set of Fig. 7. The 
procedure is as follows. First of all, the frequency maps are cleaned of the 
most damaging contaminants. In particular, extragalactic point sources are 
subtracted with the MHW and subsequently the emission from dust and CMB is 
removed using the information of the 857 and 217 GHz channels respectively. 
The next step consists of obtaining a map of the Compton parameter $y_c$ in 
Fourier space by maximising, mode by mode, the posterior probability 
$P(y_c|d)$. Taking into account Bayes' theorem, this probability is given by 

$$ P(y_c|d) \propto P(d|y_c)\,P(y_c) \qquad (29) $$


In order to perform this maximisation we need to know the likelihood function 
$P(d|y_c)$ and the prior $P(y_c)$. Since the residuals left in the frequency 
maps are mainly dominated by the instrumental noise, the likelihood can be 
well approximated by a multivariate Gaussian distribution. In addition, one 
needs to assume a form for the prior $P(y_c)$. Using SZ simulations, [20] 
finds that the prior follows approximately the form 
$P(y_c) \propto \exp(-|y_c|^2/P_{y_c})$ at each k-mode, where $P_{y_c}$ is 
the power spectrum of the SZ map. Taking these results into account one gets, 
after maximising the posterior probability, the following solution for the 
$y_c$ map at each mode: 

$$ y_c = \frac{d\, C^{-1} R^{\dagger}}{R\, C^{-1} R^{\dagger} + P_{y_c}^{-1}} \qquad (30) $$

where $d$ is the data, $R$ is the response vector (that includes the 
information from the beam at each frequency and the frequency dependence of 
the thermal SZ effect) and $C$ is the cross-correlation matrix of the 
residuals. This result coincides with the multifrequency Wiener filter 
solution for the Compton parameter $y_c$. Note that the recovered $y_c$ map 
will depend on the assumed power spectrum $P_{y_c}$. However, [20] shows that 
the final results do not depend significantly on the particular choice of 
$P(y_c)$, provided its form satisfies some general conditions. We would like 
to remark that this method does not need to make any assumption about the 
profile of the SZ clusters. 
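
A single-mode evaluation of a multifrequency Wiener solution of this kind can 
be sketched as follows. The example assumes, for simplicity, that the 
residuals are uncorrelated between channels, so that $C$ is diagonal and its 
inverse is trivial; the response values, noise variances and the function 
name `wiener_yc` are hypothetical and only serve to show the shrinkage that 
the prior power spectrum introduces. 

```python
def wiener_yc(d, R, C_diag, P_y):
    """Wiener estimate of y_c for one Fourier mode, assuming a diagonal C.

    d      : data at this mode, one (complex) value per frequency channel
    R      : response vector (beam times SZ frequency dependence)
    C_diag : variance of the residuals in each channel
    P_y    : assumed power spectrum of the y_c map at this mode
    """
    num = sum(d_nu * R_nu.conjugate() / c for d_nu, R_nu, c in zip(d, R, C_diag))
    den = sum(abs(R_nu) ** 2 / c for R_nu, c in zip(R, C_diag)) + 1.0 / P_y
    return num / den


# Toy mode with three channels (illustrative numbers only).
R = [-1.5, -1.0, 2.0]
C_diag = [1.0, 0.5, 4.0]
y_true = 0.3 + 0.1j
d = [R_nu * y_true for R_nu in R]  # noise-free data for this check

y_est = wiener_yc(d, R, C_diag, P_y=1.0)
# Here sum(|R|^2 / C) = 5.25, so y_est = y_true * 5.25 / 6.25 = 0.84 * y_true.

y_weak_prior = wiener_yc(d, R, C_diag, P_y=1e12)  # prior term negligible
```

With a finite `P_y` the estimate is pulled towards zero by the factor 
0.84 in this toy case, and it tends to the input value as the prior term 
becomes negligible. 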

The recovered $y_c$ map after applying this approach to the Planck simulated 
data of Fig. 7 is shown in Fig. 9. The detection and the estimation of the 
flux of clusters are performed in the recovered map using the package 
SExtractor, which selects connected pixels above a given threshold. Using a 
3σ threshold, this method predicts that Planck will detect around 9000 SZ 
clusters over 4/5 of the sky. In addition, it is shown that the flux of the 
clusters is estimated with no significant bias, which is important to carry 
out cosmological studies with the recovered catalogue. 

Fig. 9. The input (left panel) and reconstructed (right panel) Compton 
parameter map after applying the Bayesian non-parametric method. 


5 Techniques for extraction of the kinetic SZ effect 

The determination of peculiar velocities of individual clusters through the 
kinetic SZ effect is a very challenging task. This is mainly due to the facts 
that the kinetic SZ emission is very weak - around 1 order of magnitude 
weaker than the thermal SZ effect - and that it has the same frequency 
dependence as the CMB, meaning that both signals cannot be separated using 
only multifrequency information. In addition, the presence of all the other 
components of the microwave sky and the instrumental noise makes it even more 
difficult to detect this tiny signal. On the other hand, we should take 
advantage of some characteristics of the kinetic SZ that could help to 
extract this emission. An important point is to use the available 
information about the thermal SZ. Since both SZ effects are produced by 
clusters of galaxies, there is a strong spatial correlation between them. 
Multifrequency observations are also important to separate the kinetic SZ 
signal from the other components of the microwave sky (except the CMB). In 
particular, it is useful to consider observations at the frequency of 217 
GHz, where the contribution of the thermal SZ is expected to be negligible. 
Finally, the probability distribution of the kinetic SZ signal (expected to 
be highly non-Gaussian) and its power spectrum are very different from those 
of the cosmological signal, which could also be used to separate it from the 
CMB fluctuations. 

Due to the complexity of the problem, only a few methods have been proposed 
and tested with the aim of extracting this emission from microwave 
observations. For instance, [?] studied the performance of an optimal filter 
(which is actually a matched filter for the cluster profile) to detect the 
kinetic SZ effect, concluding that peculiar velocities could be measured only 
for a few fast moving clusters at intermediate redshift. [?] also applied 
their Bayesian algorithm to detect the kinetic SZ effect on CMB simulations 
at 217 GHz, but including only CMB and instrumental noise in the background. 
They claim that their technique is around twice as sensitive as the optimal 
linear filter. An alternative approach has been proposed by [?], which makes 
use of the spatial correlation between the thermal and kinetic SZ effects. 
The method is tested in ideal conditions, using as starting point a map of 
the Compton parameter and a second map containing only CMB and kinetic SZ 
emission. In these ideal conditions the method provides very promising 
results. However, a detailed study of the performance of the method under 
realistic conditions should be done before establishing the true potential of 
this approach. Recently, [?] tested a modification of the matched multifilter 
on Planck simulated data for the detection of the kinetic SZ effect, which we 
discuss here in more detail. 

5.1 The unbiased matched multifilter 

If multifrequency information is available, we can also construct a MMF 
adapted to the kinetic SZ emission. As for the thermal SZ, the shape of the 
sought source will be the convolution of the antenna beam with the cluster 
profile but the frequency dependence will now follow that of the kinetic SZ 
emission. However, [?] found that the estimation of the kinetic SZ effect 
(and in fact that of the thermal SZ effect) using the MMF is intrinsically 
biased. It can be shown that this is due to the presence of two signals (the 
thermal and kinetic SZ effects in this case) that have basically the same 
spatial profile. Given the difference in amplitude of both effects, this bias 
is negligible for the case of the thermal SZ but it can be very important for 
the kinetic one. In order to correct this bias, a new family of filters, the 
unbiased matched multifilter (UMMF), has been constructed. For the case of 
the kinetic SZ effect, 
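the UMMF is given by [?]: 

$$ \Phi(q) = \frac{1}{\Delta}\, P^{-1} \left( \alpha^{-1} \tau - \beta F \right), \qquad \Delta = \alpha^{-1} \gamma - \beta^{2}, $$
$$ \beta = \int d\mathbf{q}\, \tau^{T} P^{-1} F, \qquad \gamma = \int d\mathbf{q}\, \tau^{T} P^{-1} \tau \qquad (31) $$

where $\tau$ is the column vector describing the kinetic SZ signal (the 
cluster shape $\tau_\nu$ at each frequency, weighted by the frequency 
dependence of the kinetic SZ effect) and $F$, $P$ and 
$\alpha^{-1} = \int d\mathbf{q}\, F^{T} P^{-1} F$ are defined as for the MMF. 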


This new multifilter leads to a slightly lower amplification of the sources 
than the MMF, but is intrinsically unbiased. The UMMF has been tested using 
Planck simulated data of small patches of the sky including CMB, Galactic 
foregrounds (synchrotron, free-free, thermal and spinning dust) and point 
sources. In order to test just the effect of the intrinsic bias on the 
estimation of the amplitude, simplistic simulations of clusters have been 
used, and the knowledge of the profile and position of the clusters has been 
assumed. 
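
The intrinsic bias and its correction can be illustrated with a noise-free 
toy calculation. The sketch below assumes that the unbiased filter is the 
minimum-variance linear filter with unit response to the kinetic SZ vector 
and zero response to the thermal SZ vector, which is one natural reading of 
the construction described above and not necessarily the exact form used in 
the original work. The one-dimensional modes, the three channels, the 
diagonal noise matrix, the frequency dependence `f_tsz` and the amplitudes 
are all hypothetical. 

```python
import math


def inner(u, v, noise_var, dq):
    """Discretised integral of u^T P^-1 v over modes, for a diagonal P."""
    return dq * sum(
        u_n * v_n / p_n
        for u_k, v_k in zip(u, v)
        for u_n, v_n, p_n in zip(u_k, v_k, noise_var)
    )


def response(filt, signal, dq):
    """Value of the filtered map at the source position."""
    return dq * sum(
        w_n * s_n for w_k, s_k in zip(filt, signal) for w_n, s_n in zip(w_k, s_k)
    )


# Toy set-up (all numbers are hypothetical): 3 channels, 1-D modes.
dq = 0.1
modes = [k * dq for k in range(100)]
f_tsz = [-1.0, 0.0, 2.0]     # thermal SZ frequency dependence
beam = [2.0, 1.5, 1.0]       # beam widths per channel
noise_var = [1.0, 1.0, 1.0]  # white noise per channel (diagonal P)

tau = [[math.exp(-0.5 * (q * s) ** 2) for s in beam] for q in modes]
F = [[f * t for f, t in zip(f_tsz, tau_k)] for tau_k in tau]

a = inner(F, F, noise_var, dq)      # integral of F^T P^-1 F
b = inner(tau, F, noise_var, dq)    # integral of tau^T P^-1 F
g = inner(tau, tau, noise_var, dq)  # integral of tau^T P^-1 tau
delta = a * g - b * b

# Matched multifilter for the kinetic SZ vector tau.
mmf = [[t / p / g for t, p in zip(tau_k, noise_var)] for tau_k in tau]
# Filter with unit response to tau and zero response to F.
ummf = [
    [(a * t - b * fv) / p / delta for t, fv, p in zip(tau_k, F_k, noise_var)]
    for tau_k, F_k in zip(tau, F)
]

# Noise-free cluster: kinetic amplitude v_true plus thermal amplitude y_true.
v_true, y_true = -0.1, 1.0
data = [[v_true * t + y_true * fv for t, fv in zip(tau_k, F_k)]
        for tau_k, F_k in zip(tau, F)]

v_mmf = response(mmf, data, dq)    # v_true + y_true * b / g
v_ummf = response(ummf, data, dq)  # v_true, up to rounding
```

In this toy case the matched multifilter returns the kinetic amplitude plus a 
leakage term proportional to the thermal amplitude, while the second filter 
recovers the input value. The price, as stated in the text, is a somewhat 
lower amplification of the sources. 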

Fig. 10 shows the normalised histogram of the recovered parameter 
$V = (v_r m_e c)/(k_B T_e)$ using the MMF and the UMMF, obtained from 
simulations that contained clusters with $r_c$ = 1.5 arcmin, 
$y_c = 10^{-4}$ and $V = -0.1$. For an electron temperature of 
$T_e \sim 5$ keV, this value of $V$ corresponds to a radial velocity along 
the line of sight of magnitude $|v_r| \simeq 300$ km s$^{-1}$. As predicted, 
the estimation of the amplitude of the kinetic SZ effect is strongly biased 
when using the MMF. However, this bias is corrected when the data are 
filtered with the UMMF. [?] also shows that this result remains valid for 
smaller values of $y_c$ or $V$. Unfortunately, the error in the 
determination of peculiar velocities remains very large even for bright 
clusters. For instance, for $y_c = 10^{-4}$ and $T_e \sim 5$ keV the 
statistical error in the determination of $v_r$ is $\sim 800$ km s$^{-1}$. 
This means that Planck will not be able to measure, in general, the peculiar 
velocities of individual clusters, at least using just the UMMF. Nonetheless, 
since the UMMF provides an unbiased estimation of $v_r$, it could be possible 
to measure mean peculiar velocities on large scales by averaging over many 
clusters. 
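
To give a feeling for the numbers involved, consider the idealised case (an 
assumption, not a result from the work discussed above) in which the errors 
of different clusters are independent and all of the order of the 800 km/s 
quoted for a bright cluster. The error on the mean velocity of $N$ clusters 
then scales as 

```latex
\sigma_{\langle v_r \rangle} \simeq \frac{\sigma_{v_r}}{\sqrt{N}}
\quad\Longrightarrow\quad
N \simeq \left( \frac{\sigma_{v_r}}{\sigma_{\langle v_r \rangle}} \right)^{2}
  = \left( \frac{800\ \mathrm{km\,s^{-1}}}{100\ \mathrm{km\,s^{-1}}} \right)^{2}
  = 64
```

so that, under these assumptions, a mean velocity could be constrained at 
the 100 km/s level with several tens of clusters. Fainter clusters with 
larger individual errors, or correlated residuals, would change this number. 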

6 Extraction of statistical information from undetected 
sources 

In some cases the sought signal may be too weak to be individually detected 
or we may have found the brightest sources but we are unable to go down in 
flux. However, in this case, it may still be possible to extract some valuable 
statistical information from the background of sources. In this section we 
briefly discuss some works that have pursued this objective. 

Fig. 10. Normalised histogram of the V parameter using the MMF (left panel) 
and the UMMF (right panel). The red vertical line indicates the input value 
of V. 

The differential number counts of extragalactic point sources are usually 
parameterised as 

$$ n(S) = k\,S^{-\eta}, \qquad S > 0 \qquad (32) $$

where $S$ is the flux of the source and $k$ and $\eta$ are the normalisation 
and slope parameters respectively. [?] uses the information of the high 
order moments of simulated data containing CMB, noise and residual point 
sources to estimate these two parameters. [?] determines $k$ and $\eta$ by 
fitting the characteristic function of the point source distribution to an 
α-stable model. The method is tested in the presence of a Gaussian 
background. These approaches offer interesting possibilities for the 
extraction of statistical information from an unresolved background of point 
sources. However, the study of their performance under more realistic 
conditions - that take into account, for instance, the presence of 
anisotropic noise and non-Gaussian foregrounds - remains to be done. 
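
When testing methods of this kind, one needs simulated source populations 
that follow the assumed power-law counts. A minimal way of drawing fluxes 
from a power law is inverse-transform sampling, sketched below. The flux 
limits `s_min` and `s_max`, the slope value and the function names are 
assumptions introduced for the example (a lower cut is needed for the 
distribution to be normalisable); the normalisation of the counts only fixes 
the number of sources and does not enter the sampling. 

```python
import random


def sample_fluxes(slope, s_min, s_max, n, rng):
    """Draw n fluxes from n(S) proportional to S**(-slope) on [s_min, s_max].

    Inverse-transform sampling; valid for slope != 1.
    """
    lo, hi = s_min ** (1.0 - slope), s_max ** (1.0 - slope)
    return [(lo + rng.random() * (hi - lo)) ** (1.0 / (1.0 - slope))
            for _ in range(n)]


def mean_flux(slope, s_min, s_max):
    """Analytic mean flux of the same truncated power law (slope != 1, 2)."""
    num = (s_max ** (2.0 - slope) - s_min ** (2.0 - slope)) / (2.0 - slope)
    den = (s_max ** (1.0 - slope) - s_min ** (1.0 - slope)) / (1.0 - slope)
    return num / den


rng = random.Random(42)
slope, s_min, s_max = 2.5, 1.0, 100.0  # illustrative values
fluxes = sample_fluxes(slope, s_min, s_max, 100000, rng)

sample_mean = sum(fluxes) / len(fluxes)
expected_mean = mean_flux(slope, s_min, s_max)
```

Comparing `sample_mean` with `expected_mean` is a quick sanity check of the 
sampler before the simulated sources are added to a background map. 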

Another interesting possibility is the study of the bispectrum of undetected 
point sources, since this quantity will depend on the characteristics of the 
underlying population of extragalactic point sources [?]. It also provides 
an estimate of the level of contamination introduced by the residual point 
sources. In particular, [?] estimated the power spectrum of residual point 
sources in the WMAP data through the measurement of the bispectrum. 

One may also use statistical information of the unresolved sources to 
identify which type of emission is present. In particular, [?] presents a 
detailed study of the contribution of the thermal SZ emission and of the 
extragalactic point sources to the probability distribution of the brightness 
map. The different imprints left by these two emissions would allow one to 
discriminate whether the excess of power found at small scales in some CMB 
data is due to an unresolved background of point sources or to the presence 
of unresolved SZ clusters. An alternative study based on the analysis of the 
Gaussianity of the wavelet coefficients has also been carried out to explain 
this excess of power [?]. 

Regarding the kinetic SZ effect, we have already mentioned that the 
detection of peculiar velocities of individual clusters will be a very 
challenging task. However, one could infer bulk flows on large scales, which 
would provide valuable cosmological information. This point has been 
addressed by different authors, including [?]. 


7 Conclusions 

The development of techniques for the extraction of compact sources in CMB 
observations has become a very relevant and active topic. This is due to the 
necessity of cleaning the CMB maps of astrophysical contaminants that would 
impair our ability to extract all the valuable information encoded in the 
cosmological signal, but also because the recovered catalogues of point 
sources and/or SZ clusters would themselves contain extremely relevant 
astrophysical and cosmological information. 

A considerable effort has been made in recent years towards the development 
of more powerful and sophisticated tools to extract compact sources. Many of 
them have been tested on simulated Planck observations, showing their 
potential. However, some important work still remains to be done. First of 
all, in some cases, quite ideal conditions have been assumed. For instance, 
it is commonly assumed that the cluster profile is known but, in general, 
this will not be the case for real data. Other methods have been applied to 
simulations that do not include foreground emissions. Therefore, these and 
other issues present in real data - beam asymmetry, extension to the sphere, 
relativistic effects, anisotropic noise, etc. - should be taken into account 
to establish the true performance of the developed methods. Also, the methods 
do not always use all the available information present in the data. For 
instance, if multifrequency observations are available, it would be useful to 
include this multifrequency information in the detection of extragalactic 
point sources even if they do not follow a simple well-known spectral law. 
The final and most important step would be to apply these techniques to real 
CMB data (e.g. WMAP) as they become available. 

We would also like to point out that many polarisation experiments are 
currently planned (or already operating) which will provide a wealth of 
information about our universe [?]. Given the weakness of the cosmological 
signal in polarisation and the current lack of knowledge regarding the 
foreground emissions, a careful process of cleaning of the CMB polarisation 
maps is even more critical than for the intensity case. However, no 
techniques have yet been specifically developed to extract compact sources 
from polarisation CMB observations. Therefore it is crucial to extend some 
of the current methods - or to develop new ones - to deal with this type of 
map. 

Finally, a very critical issue is to assess the impact of possible residuals 
left in the CMB data after applying these techniques [?]. In particular, it 
is very important to control the effect of undetected sources, or even 
possible artifacts introduced in the image after subtracting the signals, on 
the estimation of the power spectrum of the CMB. In addition, this process 
should not modify the underlying CMB temperature distribution, since it would 
impair our ability to perform Gaussianity analyses of the CMB (or even lead 
us to wrong conclusions), which are of great importance to learn about the 
structure formation of our universe. 